G-SOMO: An oversampling approach based on self-organized maps and geometric SMOTE

Abstract

Traditional supervised machine learning classifiers are challenged to learn highly skewed data distributions, as they are designed to expect classes to contribute equally to the minimization of the cost function. Moreover, the classifier design expects equal misclassification costs, causing a bias for overrepresented classes. Different strategies have been proposed to correct this issue. The modification of the data set has become a common practice since the procedure is generalizable to all classifiers. Various algorithms to rebalance the data distribution through the creation of synthetic instances were proposed in the past. In this paper, we propose a new oversampling algorithm named G-SOMO. The algorithm identifies optimal areas to create artificial data instances in an informed manner and utilizes a geometric region during the data generation process to increase their variability. Our empirical results on 69 datasets, validated with different metrics against a benchmark of commonly used oversampling methods, show that G-SOMO consistently outperforms competing oversampling methods. Additionally, the statistical significance of our results is established.

Similar resources

Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTE

Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. SMOTE algorithm and its variations generate synthetic samples along a line segment that joins minority class instances. In th...
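The line-segment generation mentioned in this abstract can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' implementation: the function names `smote_sample`, `nearest_neighbors`, and `smote`, the toy `minority` points, and the choice of `k` are all assumptions made for illustration.

```python
import random


def smote_sample(x, neighbor, rng=random):
    """Return a synthetic point on the line segment joining x and neighbor."""
    gap = rng.random()  # interpolation factor in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def nearest_neighbors(x, points, k):
    """Return the k points closest to x (squared Euclidean), excluding x itself."""
    others = [p for p in points if p is not x]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
    return others[:k]


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points by line-segment interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)  # pick a minority instance at random
        neighbor = rng.choice(nearest_neighbors(x, minority, k))
        synthetic.append(smote_sample(x, neighbor, rng))
    return synthetic


# Hypothetical toy minority class with two features.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.5]]
new_points = smote(minority, n_new=5)
```

Because every synthetic point is a convex combination of two existing minority instances, it always falls inside the bounding box of the minority class. That restriction to a line segment is what a geometric extension would relax.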

Oversampling for Imbalanced Learning Based on K-Means and SMOTE

Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification a...

Study of hash functions based on chaotic maps

Hash functions play a very important role in cryptographic systems and security protocols. In cryptographic systems, two methods are used to achieve data integrity and authenticity: keyed cryptographic functions and hash functions. Hash functions are functions that map a text of arbitrary length to a fixed-length sequence. Among the most widely used and best-known hash functions are MD4, MD...

Novel Approach for Speech Recognition by Using Self-Organized Maps

The method of self-organizing maps (SOM) is a method of exploratory data analysis used for clustering and projecting multi-dimensional data into a lower-dimensional space to reveal the hidden structure of the data. Self-Organizing Feature Maps (SOFMs) [11] are a class of neural networks capable of recognizing the main features of the data they are trained on. There is extensive literature on its...

An Agent-Based Approach to Self-organized Production

The chapter describes the modeling of a material handling system with the production of individual units in a scheduled order. The units represent the agents in the model and are transported in the system which is abstracted as a directed graph. Since the hindrances of units on their path to the destination can lead to inefficiencies in the production, the blockages of units are to be reduced. ...

Journal

Journal title: Expert Systems With Applications

Year: 2021

ISSN: 1873-6793, 0957-4174

DOI: https://doi.org/10.1016/j.eswa.2021.115230